Comment: The Place of Death in the

1. MISTAKING AN OUTCOME FOR A COVARIATE

Donald Rubin's lucid discussion of censoring by death comments on
several issues: he warns against mistakes, describes obstacles to
inference that might be surmounted within a given investigation, and
discusses barriers to inference that direct attention to new data
from outside the current investigation. Censoring by death creates
outcomes that are defined only contingently, such as quality of life
defined only for survivors. If the contingency is an outcome of
treatment, that is, if survival could be affected by the treatment,
then, as Rubin demonstrates, it is a serious analytical mistake to
act as if the contingency were a covariate, a variable unaffected by
treatment, when studying the effect of the treatment on the
contingently defined outcome. This is one instance of a family of
interlinked errors in which an analysis uses an outcome of treatment
as if it were a covariate measured before treatment. Other instances
in this same family are adjusting for an outcome as if it were a
covariate (Rosenbaum, 1984), or attempting to define an interaction
effect between a treatment and an outcome of treatment (Rosenbaum,
2004). One of the several advantages of defining outcomes of
treatment as comparisons of potential responses under alternative
treatments (Neyman, 1923; Rubin, 1974) is that it becomes difficult
to make these mistakes: outcomes exist in several versions depending
upon the treatment, whereas covariates exist in a single version.

Figure 1 depicts the mistake Rubin warns against. It is a simulated
randomized experiment, with N = 650 subjects, of whom n = 325 were
randomized to treatment, where 16 died, and m = 325 were randomized
to control, where 111 died, and Figure 1 depicts quality of life
scores for survivors. Beginning with the structure as Rubin develops
it, I will propose a somewhat different analysis. In Section 2
notation describes a completely randomized experiment of the type
depicted in Figure 1, with censoring by death but without covariates;
then Section 3 proposes a method of analysis that separates empirical
evidence of treatment effect from diverging patient preference
orderings of death and various qualities of life.

FIG. 1. Comparison of quality of life. Treated, n = 309; Control,
n = 214. Excludes 16 treated deaths and 111 control deaths.

2. CENSORING BY DEATH IN A COMPLETELY RANDOMIZED EXPERIMENT

There is a finite population of N subjects, i = 1, ..., N, who have
given informed consent to be randomized to receive either the
treatment condition or the control condition, where subject i would
exhibit response $r_{Ti}$ under treatment or response $r_{Ci}$ under
control. Write $\mathcal{R}$ for $\{(r_{Ti}, r_{Ci}),\ i = 1, \ldots, N\}$
for the potential responses of the N subjects, which are fixed
features of the finite population of N subjects. Of the N subjects, a
fixed number, n, with $1 \le n < N$, are picked at random for
treatment, denoted $Z_i = 1$, the remaining m = N - n receiving
control, denoted $Z_i = 0$, so that $n = \sum_{i=1}^{N} Z_i$, and all
$\binom{N}{n}$ possible treatment assignments
$Z = (Z_1, \ldots, Z_N)^T$ have the same probability
$\binom{N}{n}^{-1}$. The response, $R_i$, actually observed from i is
$r_{Ti}$ if $Z_i = 1$ or $r_{Ci}$ if $Z_i = 0$, and the observed data
are $\mathcal{O} = \{(R_i, Z_i),\ i = 1, \ldots, N\}$. Here
$\mathcal{R}$ is fixed but $\mathcal{O}$ is random, and the
distribution of $\mathcal{O}$ is created from $\mathcal{R}$ by the
known probability distribution used in the random assignment of
treatments. The task of inference in a completely randomized
experiment is to say something about the effects caused by the
treatment, $\mathcal{R}$, from the observed data, $\mathcal{O}$, and
the known distribution of treatment assignments. This commonplace
description of a randomized experiment is found, for instance, in
Welch (1937), and it merged certain ideas from Fisher (1935) about
randomization inference and certain ideas from Neyman (1923) about
treatment effects.
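
As a rough illustration of the setup just described, the Python sketch
below draws one complete randomization and builds the observed data
from fixed potential responses. The function name `randomize`, the
marker string "D" for a death, and the six potential-response pairs are
all invented for the example; they are assumptions, not data or code
from the experiment in Figure 1.

```python
import random

D = "D"  # marker for a response censored by death (illustrative convention)

def randomize(potential, n, rng):
    """Pick n of the N subjects for treatment, every subset of size n equally likely.

    potential: list of (r_T, r_C) pairs, the fixed potential responses.
    Returns the observed data as a list of (R_i, Z_i) pairs.
    """
    N = len(potential)
    treated = set(rng.sample(range(N), n))  # one of the C(N, n) assignments
    observed = []
    for i, (r_t, r_c) in enumerate(potential):
        z = 1 if i in treated else 0
        # Only the potential response under the assigned condition is seen.
        observed.append((r_t if z == 1 else r_c, z))
    return observed

# Hypothetical potential responses for N = 6 subjects (invented numbers).
potential = [(4.2, D), (3.9, 3.1), (D, D), (4.5, 4.0), (3.3, D), (4.1, 3.8)]
observed = randomize(potential, n=3, rng=random.Random(0))
```

The list `potential` stays fixed across calls, while `observed` changes
with the random assignment, which mirrors the distinction between the
fixed potential responses and the random observed data.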

Following Rubin's approach, I will understand "censoring by death" to
mean that the response is a numerical measure of "quality of life" at
a particular time, say a year, after treatment, taking values in a
subset $Q$ of the real line, but the measure is not defined if the
subject has "died" before that time, in which case the letter "D"
appears in place of the numerical measure, so $(r_{Ti}, r_{Ci})$
could be a pair of numbers, a D paired with a number, a number paired
with a D, or a pair of D's. The mistake mentioned in Section 1
consists in setting aside the D's when studying quality of life, and
as Rubin's discussion makes very clear, setting aside the deaths
means not estimating the effects of the treatment on quality of life.

It is sometimes the case that deaths can be compared ordinally to
various qualities of life, even though numerical comparisons are not
possible; that is, $Q \cup \{D\}$ may be a totally ordered set, with
strict inequality $\prec$ and with equality-or-inequality $\preceq$,
but the elements of $Q \cup \{D\}$ cannot be manipulated
arithmetically to yield averages or expected values. One common view,
perhaps the default view, might order death as inferior to any
quality of life, and that view might have such diverse sources as
religious teachings or the very different observation that a living
person can end his or her life, so remaining alive with a given
quality of life reveals a preference for that quality of life over
death. This common or default view is, no doubt, not universally
held, and a particular person might order death or D as preferable to
the lowest or worst qualities of life in $Q$. The analysis that I
will describe can accommodate any total ordering of $Q \cup \{D\}$;
it need not place D below all of $Q$. Faced with diverse preferences
among different patients, one can carry out the proposed analysis
with several different placements of D in $Q \cup \{D\}$, in which
case the empirical results of a single experiment might speak
differently to different patients, and each patient could select the
analysis that corresponds to that patient's own evaluation. This is
illustrated in Section 3. When a total ordering of $Q \cup \{D\}$ is
possible, to what extent does it facilitate inferences about the
effects caused by treatments? More abstractly, what can be said about
treatment effects when outcomes take values in a totally ordered set
that lacks algebraic operations?
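
One way to make a total order on the qualities of life together with D
concrete is a sort key, sketched below in Python. The sketch assumes a
simple family of placements in which D sits at a single cut point:
qualities below the cut rank below death, and qualities at or above it
rank above death. The names `order_key`, `order_statistic` and
`death_threshold` are hypothetical, and the responses are invented.
Only comparisons are used; no averages are ever taken.

```python
import math

D = "D"  # marker for a response censored by death (illustrative convention)

def order_key(value, death_threshold=-math.inf):
    """Sort key for the set of qualities of life together with D.

    Qualities below death_threshold rank below D; qualities at or above
    it rank above D. The default places D below every quality of life.
    """
    if value == D:
        return (death_threshold, 0)
    return (value, 1)

def order_statistic(values, i, death_threshold=-math.inf):
    """Return the i-th smallest response (1-indexed) under the chosen order."""
    ranked = sorted(values, key=lambda v: order_key(v, death_threshold))
    return ranked[i - 1]

responses = [4.2, D, 3.1, 3.6, D]  # invented responses

print(order_statistic(responses, 3))                       # 3.1
print(order_statistic(responses, 3, death_threshold=3.5))  # D
```

With the default placement the ordered responses are D, D, 3.1, 3.6,
4.2, so the middle value is 3.1; with the cut at 3.5 they are 3.1, D,
D, 3.6, 4.2, so the middle value is D. The same data therefore yield
different order statistics under different placements of D.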

3. THE QUALITY OF LIFE AMID DEATH 

In the randomized experiment, we observe n of the N potential
responses to treatment and we do not observe m of the N potential
responses to treatment, and we observe m of the N potential responses
to control, but we do not observe n of the N potential responses to
control. Let $R_{T(1)} \preceq R_{T(2)} \preceq \cdots \preceq R_{T(n)}$
denote the ordered, observed responses to treatment, including the
D's, for the n treated subjects, $Z_i = 1$, and let
$\tilde{R}_{T(1)} \preceq \tilde{R}_{T(2)} \preceq \cdots \preceq \tilde{R}_{T(m)}$
denote the unobserved, ordered responses to treatment for the m
control subjects, $Z_i = 0$. In Figure 1, there are 16 D's observed
in the treated group, so 16 of the $R_{T(i)}$'s are D's, and if
deaths are placed below any quality of life by $\preceq$, then
$R_{T(1)} = \cdots = R_{T(16)} = D$. Similarly, let
$R_{C(1)} \preceq R_{C(2)} \preceq \cdots \preceq R_{C(m)}$ denote
the ordered, observed responses to control for the m control
subjects, $Z_i = 0$, and let
$\tilde{R}_{C(1)} \preceq \tilde{R}_{C(2)} \preceq \cdots \preceq \tilde{R}_{C(n)}$
denote the ordered, unobserved responses to control for the n treated
subjects, $Z_i = 1$. Note that, although $\mathcal{R}$ is fixed, the
$R_{T(i)}$, $\tilde{R}_{T(j)}$, $R_{C(k)}$ and $\tilde{R}_{C(\ell)}$
are random variables with distributions created from $\mathcal{R}$ by
random assignment of treatments; moreover, the $R_{T(i)}$,
$\tilde{R}_{T(j)}$, $R_{C(k)}$ and $\tilde{R}_{C(\ell)}$ may be
numbers in $Q$ or the letter D.

Fix an i, $1 \le i \le n$, and consider the bivariate random vector
$Y_{(i)} = (R_{T(i)}, \tilde{R}_{C(i)})$. Here, $R_{T(i)}$ is the
observed ith smallest response of the n responses of the n subjects
randomly assigned to treatment, and $\tilde{R}_{C(i)}$ is the
unobserved ith smallest response that would have been observed from
these same n subjects had they all received the control instead, and
either coordinate of $Y_{(i)}$ may be a D. If n were odd and
i = (n + 1)/2, then $Y_{(i)} = (R_{T(i)}, \tilde{R}_{C(i)})$ would
compare the median of the n observed responses, including deaths, to
treatment among n treated subjects to the median of the n unobserved
responses, including deaths, that would have been observed among
these same n treated subjects had they all received control instead
of treatment. Notice carefully that there may be no individual j with
$(r_{Tj}, r_{Cj}) = (R_{T(i)}, \tilde{R}_{C(i)})$, and the quantity
$R_{T(i)} - \tilde{R}_{C(i)}$ is not generally defined because either
$R_{T(i)}$ or $\tilde{R}_{C(i)}$ may equal D.

Because $\tilde{R}_{C(i)}$ is not observed, $Y_{(i)}$ too is not
observed. An exact, randomization-based confidence set for $Y_{(i)}$
will now be defined. Recall that $\mathcal{C}_{(i)}$ is a
$1 - \alpha$ confidence set for an unobserved random vector
$Y_{(i)}$ if (i) $\mathcal{C}_{(i)}$ is a function of the observed
data, $\mathcal{O}$, and (ii)
$1 - \alpha \le \Pr\{Y_{(i)} \in \mathcal{C}_{(i)}\}$; see Weiss
(1955). Proposition 1 rephrases a result due to Fligner and Wolfe
(1976, page 83, B; 1979); see Remark 2 following the proposition. The
confidence set for $Y_{(i)}$ is the observed $R_{T(i)}$ and an
interval for $\tilde{R}_{C(i)}$ formed from two of the observed
$R_{C(j)}$'s. Notice that the interval may have one or both endpoints
D.

Proposition 1 (Fligner and Wolfe, 1976, 1979). If
$1 \le a < b \le m$ are two integers such that

(1)    $1 - \alpha \le \sum_{j=a} [\ldots] \Big/ \binom{N}{m}$,

then $\mathcal{C}_{(i)} = \{(R_{T(i)}, w) : w \in [R_{C(a)}, R_{C(b)}]\}$
is a $1 - \alpha$ confidence set for $Y_{(i)}$.

Remark 2. Fligner and Wolfe (1976) derive a prediction interval for
an order statistic from a future sample starting from i.i.d. sampling
of an infinite population, but it is straightforward to derive their
combinatorial result, namely their Corollary 4.1 in Fligner and Wolfe
(1976), from random assignment of treatments in a finite population,
and from this, the coverage of their prediction interval follows.
Specifically, (i) start by assuming the N fixed, ordered responses to
control are untied,
$r_{C(1)} \prec r_{C(2)} \prec \cdots \prec r_{C(N)}$; (ii) then,
[...] of the $\binom{N}{n}$ possible random assignments Z produce
Table 1, yielding (1) in agreement with Corollary 4.1 in Fligner and
Wolfe (1976); (iii) finally, note with Fligner and Wolfe (1976, page
84) or by other methods that ties among the $r_{Ci}$ make the
prediction interval conservative.
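
The kind of counting over equally likely assignments that Remark 2
describes can be illustrated with a short Python sketch. It is an
independent illustration under the no-ties assumption of step (i), not
a transcription of Table 1 or of expression (1). It counts the
assignments in which exactly j of the m control subjects have control
responses below the ith smallest control response among the n treated
subjects, and checks the closed-form count against brute-force
enumeration. The function names `prob_controls_below` and `brute_force`
are hypothetical.

```python
from itertools import combinations
from math import comb

def prob_controls_below(N, n, i, j):
    """Randomization probability, with untied control responses, that exactly
    j of the m = N - n control subjects fall below the i-th smallest control
    response among the n treated subjects."""
    m = N - n
    if not (1 <= i <= n and 0 <= j <= m):
        return 0.0
    # That response has rank i + j among all N: choose which j of the
    # i + j - 1 lower ranks are controls, and which m - j of the
    # N - i - j higher ranks are controls.
    return comb(i + j - 1, j) * comb(N - i - j, m - j) / comb(N, n)

def brute_force(N, n, i, j):
    """Same probability by listing all C(N, n) assignments; ranks 1..N stand
    in for the untied control responses."""
    hits = 0
    for treated in combinations(range(1, N + 1), n):
        rank = treated[i - 1]      # tuples are emitted in increasing order
        controls_below = rank - i  # lower ranks not occupied by treated subjects
        hits += (controls_below == j)
    return hits / comb(N, n)

# Small check with N = 7, n = 3 (sizes chosen only to keep enumeration fast).
for i in range(1, 4):
    for j in range(0, 5):
        assert abs(prob_controls_below(7, 3, i, j) - brute_force(7, 3, i, j)) < 1e-12
```

Because the probabilities depend only on ranks, the same counts apply
whether the ordered responses are numbers or D's, as long as they are
untied.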

At first, adopt the default view, that places death or D below all
qualities of life in $Q$. Then, in Figure 1, there are n = m = 325
subjects in each group, and $R_{T(i)} = D$ for i = 1, ..., 16,
$R_{C(j)} = D$ for j = 1, ..., 111. With
i = (n + 1)/2 = (325 + 1)/2 = 163, the median observed response in
the treated group is $R_{T(163)} = 4.19$. With a = 138, b = 189,
expression (1) equals 0.951, and
$[R_{C(a)}, R_{C(b)}] = [3.81, 4.16]$, so the 95% confidence set for
$Y_{(i)}$ is
$\mathcal{C}_{(i)} = \{(4.19, w) : w \in [3.81, 4.16]\}$. This 95%
confidence set excludes the possibility that, taking account of the
unequal death rates, the median quality of life score would have been
higher for the n = 325 treated subjects had they all received the
control instead, despite the appearance of Figure 1.

Table 2 gives $\mathcal{C}_{(i)}$ for i = 41, 82, 163, 244 and 285,
for the eighths, quartiles and median. Notice that for i = 82, the
lower quartile, the 95% confidence set contrasts the observed lower
quartile in the treated group, $R_{T(82)} = 3.49$, to an interval
$[R_{C(\ldots)}, R_{C(106)}] = [D, D]$, so with 95% confidence the
lower quartile of the treated group would have been "death" had all n
treated subjects received control.

Consider now a hypothetical patient who views qualities of life
greater than or equal to 3.5 as better than death, but qualities
below 3.5 as inferior to death. What does the same randomized trial
say to such a hypothetical patient with these hypothetical
preferences? In Figure 1, there are 66 treated patients and no
control patients with qualities below 3.5. As a result, with this
placement of D in $Q \cup \{D\}$, the $R_{C(j)}$ are unchanged, but
the $R_{T(i)}$ reflect the new order, with
$R_{T(i)} \prec D \prec 3.5$ for i = 1, ..., 66,
$R_{T(i)} = D \prec 3.5$ for i = 67, ..., 82, and
$D \prec 3.5 \preceq R_{T(i)}$ for i = 83, ..., n = 325. Then
$\mathcal{C}_{(41)} = \{(3.23, w) : w \in [D, D]\}$ where
$3.23 \prec D$ so, with 95% confidence, the lower eighth under
treatment is worse than it would have been had all n treated subjects
received control, but
$\mathcal{C}_{(82)} = \{(D, w) : w \in [D, D]\}$ so the lower
quartiles would be the same, and the remaining three intervals in
Table 2 are unchanged. With the default order, treatment appeared
superior, but with the hypothetical order, control appears better at
the lower eighth and worse at the median.

Perhaps there is a correct placement of death, D, amid the possible
qualities of life, $Q$, or perhaps not. Certain religious teachings
would place D below all of $Q$, but that view is not universal:
Seneca (49 A.D., page 92) wrote: "He will live badly who does not
know how to die well." The randomized experiment in Section 2
provides no new insight into the proper placement of D in
$Q \cup \{D\}$. However, for each given placement of D in
$Q \cup \{D\}$, the experiment provides information about how a group
of n people will fare under treatment and under control.